-
Notifications
You must be signed in to change notification settings - Fork 9.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve performance of AttachResourceConfigTransformer on big graphs #35088
Conversation
Thanks for this submission! I'll raise it in triage, to see if we can review it for inclusion. Thanks! Note, please sign the CLA when you get a chance per the above comment. |
The use case is a bit of an outlier; it's more related to the migrations where we have many thousands of resources, like, Databricks directory, etc. Before the change:
After the change:
|
The current implementation of `AttachResourceConfigTransformer` has performance of `O(N^2)` to the number of vertices in the graph, and this leads to significant performance degradation on huge graphs. The given implementation decreases the complexity to `O(N)` by performing direct lookup of a resource in the `ManagedResources` or `DataResources` depending on the object type. Signed-off-by: Alex Ott <[email protected]>
30396fd
to
f0e2861
Compare
Thanks @alexott! This was never looked at for optimization, because you normally overwhelm other parts of the system long before this becomes a bottleneck. Is there anything particularly unusual about the configuration, other than having a huge number of individual resources blocks? |
Just big number of resources in the plan |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like a perfectly sound change, regardless of whether the performance problem you saw can be widely observed or not. Maps are for fetching stuff by key, so I'm in favor of fetching stuff out of the map by key. 😎
continue | ||
} | ||
|
||
coord := addr.Resource.String() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's "coord" mean in this context? 😅
I guess it doesn't matter, since we just use it as a map key on the next line and never observe it again, but the name was a bit disorienting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought about it as about coordinate, but key could be a good name as well
if config.Module.ProviderMetas == nil { | ||
log.Printf("[TRACE] AttachResourceConfigTransformer: no provider metas defined for %s", dag.VertexName(v)) | ||
continue | ||
if config.Module.ProviderMetas != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This commit deletes two additional nil checks (config and config.Module) — for posterity's sake, I'll note that they were both, indeed, useless. 👍🏼 (There's already an early exit on nil config above, and the existing code was already assuming that config.Module is never nil when it iterated over two of Module's fields.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yex. The config
deletion was highlighted by VSCode itself
Reminder for the merging maintainer: if this is a user-visible change, please update the changelog on the appropriate release branch. |
@alexott @nfagerlund Just an FYI there is not currently a Changelog entry submitted for this change, I'd imagine one will be necessary for an eventual release. Thanks! |
I added one :) |
I'm going to lock this pull request because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active contributions. |
The current implementation of
AttachResourceConfigTransformer
has performance ofO(N^2)
to the number of vertices in the graph, and this leads to significant performance degradation on huge graphs.The given implementation decreases the complexity to
O(N)
by performing direct lookup of a resource in theManagedResources
orDataResources
depending on the object type.Fixes #35087
Target Release
1.9.x
Draft CHANGELOG entry
NEW FEATURES | UPGRADE NOTES | ENHANCEMENTS | BUG FIXES | EXPERIMENTS